Review for NeurIPS paper: Learning from Failure: De-biasing Classifier from Biased Classifier

Neural Information Processing Systems

Weaknesses: 1. LfF does use human knowledge; please do not claim that it doesn't. The paper criticises prior works on the de-biasing problem for using "domain-specific knowledge" or "explicit supervision" on the spuriously correlated attributes, while claiming their method is designed for scenarios where "such information is unavailable". I strongly disagree with this bold claim. LfF heavily depends on the assumption that the quickly-learned cues (so-called "malignant biases") are the undesired biases that hinder generalisation. Do quickly-learned cues **always** correspond to the undesired set of biases?


Learning from Failure: De-biasing Classifier from Biased Classifier

Neural Information Processing Systems

Neural networks often learn to make predictions that overly rely on spurious correlations existing in the dataset, which causes the model to be biased. While previous work tackles this issue by using explicit labeling of the spuriously correlated attributes or by presuming a particular bias type, we instead utilize a cheaper, yet generic form of human knowledge, which can be widely applicable to various types of bias. We first observe that neural networks learn to rely on the spurious correlation only when it is "easier" to learn than the desired knowledge, and that such reliance is most prominent during the early phase of training. Based on these observations, we propose a failure-based debiasing scheme that trains a pair of neural networks simultaneously. Our main idea is twofold: (a) we intentionally train the first network to be biased by repeatedly amplifying its "prejudice", and (b) we debias the training of the second network by focusing on samples that go against the prejudice of the biased network in (a).
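The two ingredients in (a) and (b) can be sketched numerically. The snippet below is a minimal NumPy illustration, assuming a generalized cross-entropy (GCE) loss for amplifying the first network's bias and a relative-difficulty weight for the second network's samples; the function names and exact forms are illustrative, not the authors' code:

```python
import numpy as np

def gce_loss(probs, labels, q=0.7):
    """Generalized cross-entropy, (1 - p_y^q) / q.

    Compared with ordinary cross-entropy, it up-weights the gradient of
    samples the network already classifies confidently, so training on it
    amplifies whatever "easy" (possibly spurious) cues exist -- step (a).
    """
    p_y = probs[np.arange(len(labels)), labels]  # probability of true class
    return (1.0 - p_y ** q) / q

def relative_difficulty(ce_biased, ce_debiased, eps=1e-8):
    """Per-sample weight for the debiased network -- step (b).

    Samples the biased network fails on (high ce_biased) get weight near 1;
    samples it handles easily get weight near 0, so the second network
    focuses on examples that go against the first network's prejudice.
    """
    return ce_biased / (ce_biased + ce_debiased + eps)

# Toy illustration with two samples.
probs = np.array([[0.9, 0.1],   # confident, "easy" sample
                  [0.2, 0.8]])
labels = np.array([0, 1])
g = gce_loss(probs, labels)

# A bias-conflicting sample (biased net's CE is high) gets a large weight.
w = relative_difficulty(np.array([2.0, 0.1]), np.array([0.1, 2.0]))
```

In a full training loop, `w` would multiply the cross-entropy of the second network while the first network keeps minimizing `gce_loss`, both updated simultaneously on each mini-batch.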